Abstract

For generalized Reed-Solomon codes, it has been proved by Guruswami and Vardy that the problem of
determining whether a received word is a deep hole is co-NP-complete. The reduction relies on the
fact that the evaluation set of the code can be exponential in the length of the code, a property
that practical codes do not usually possess. In this paper, we first present a much simpler proof
of the same result. We then consider the problem for standard Reed-Solomon codes, i.e., codes whose
evaluation set consists of all the nonzero elements of the field. We reduce the problem
of identifying deep holes to deciding whether an absolutely irreducible hypersurface over a
finite field contains a rational point whose coordinates are pairwise distinct and nonzero. By
applying the Schmidt and Cafure-Matera estimates of the number of rational points on algebraic
varieties, we prove that the received vector $(f(\alpha))_{\alpha \in \mathbb{F}_q}$ for the
Reed-Solomon code $[q, k]_q$, $k < q^{1/7-\epsilon}$, cannot be a deep hole whenever $f(x)$ is a
polynomial of degree $k + d$ for $1 \le d < q^{3/13-\epsilon}$.
Keywords: Reed-Solomon codes, deep hole, NP-complete, algebraic surface.

1 Introduction 

A signal transferred over a long distance can always be corrupted. Error-detecting and
error-correcting codes make modern communication possible. Reed-Solomon codes are very popular
for engineering reliable channels because of their simplicity, their burst error-correction
capabilities, and the powerful decoding algorithms they admit.

Let $\mathbb{F}_q$ be the finite field with $q$ elements, where $q$ is a prime power. The encoding
process of generalized Reed-Solomon codes can be thought of as a map
$\mathbb{F}_q^k \to \mathbb{F}_q^n$ in which a message $(a_1, a_2, \cdots, a_k)$ is mapped to a vector

$$(f(x_1), f(x_2), \cdots, f(x_n)),$$

where $f(x) = a_k x^{k-1} + a_{k-1} x^{k-2} + \cdots + a_1 \in \mathbb{F}_q[x]$ and
$\{x_1, x_2, \cdots, x_n\} \subseteq \mathbb{F}_q$ is called the evaluation set. (Note that
different encoding schemes are possible.)

It is not difficult to see that the set of codewords formed in this manner is a linear subspace
of $\mathbb{F}_q^n$ of dimension $k$, where $n$ is the length of a codeword. Reed-Solomon codes are
therefore linear codes.
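The encoding map can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the field is taken to be a prime field $\mathbb{F}_p$ (so arithmetic is plain integer arithmetic modulo $p$), and the function name `rs_encode` and the parameters $p = 7$, $k = 3$ are hypothetical choices, not taken from the text.

```python
def rs_encode(message, eval_set, p):
    """Encode (a_1, ..., a_k) as (f(x_1), ..., f(x_n)) over the prime field F_p,
    where f(x) = a_k x^(k-1) + ... + a_2 x + a_1."""
    def f(x):
        acc = 0
        for a in reversed(message):  # Horner's rule, highest coefficient first
            acc = (acc * x + a) % p
        return acc
    return [f(x) for x in eval_set]

p = 7                                  # assumed small prime, for illustration only
eval_set = list(range(1, p))           # standard code: all nonzero field elements
codeword = rs_encode([1, 2, 3], eval_set, p)   # f(x) = 3x^2 + 2x + 1
print(codeword)
```

Since the map is linear in the message, the sum of two encodings is the encoding of the sum of the messages.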

The Hamming distance between two codewords is the number of coordinates in which they
differ, that is, the number of modifications required to transform one vector into the other.
A Hamming ball of radius $m$ is the set of vectors within Hamming distance $m$ of some
vector in $\mathbb{F}_q^n$. The minimum distance of a code is the smallest distance between any
two distinct codewords, and is a measure of how many errors the code can correct or detect. The
covering radius of a code is the maximum possible distance from any vector in $\mathbb{F}_q^n$ to
the closest codeword. A deep hole is a vector which achieves this maximum. The minimum distance of
Reed-Solomon codes is $n - k + 1$. The covering radius of Reed-Solomon codes is $n - k$.
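For a very small code, both quantities can be checked by exhaustive search. The following sketch assumes the prime field $\mathbb{F}_5$, the evaluation set $\mathbb{F}_5^*$ and $k = 2$ (all illustrative choices), and simply enumerates every codeword and every vector.

```python
from itertools import product

p, k = 5, 2                 # assumed toy parameters
xs = [1, 2, 3, 4]           # evaluation set F_5^*
n = len(xs)

def evaluate(coeffs, x):
    # coeffs[i] is the coefficient of x^i
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

# all p^k codewords, one per message polynomial of degree < k
code = [tuple(evaluate(m, x) for x in xs) for m in product(range(p), repeat=k)]

min_distance = min(hamming(u, v) for u in code for v in code if u != v)
covering_radius = max(min(hamming(w, c) for c in code)
                      for w in product(range(p), repeat=n))
print(min_distance, covering_radius)
```

The search is exponential in $n$ and is meant only to make the definitions concrete.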

A code is useless without a decoding algorithm, which takes some received word (a vector in
$\mathbb{F}_q^n$) and outputs a message. The message should correspond, ideally, to the codeword
which is closest, with respect to Hamming distance, to the received word. If we assume that each
coordinate in a received word is equally likely to be in error, then the closest codeword is the
most likely to be the intended transmission.

Standard Reed-Solomon codes use $\mathbb{F}_q^*$ as their evaluation set. If the evaluation set is
$\mathbb{F}_q$, then the code is called an extended Reed-Solomon code. If the evaluation set is the
set of rational points of a projective line over $\mathbb{F}_q$, then the code is known as a doubly
extended Reed-Solomon code. In this paper, we consider standard Reed-Solomon codes; all of our
results can be easily generalized to extended Reed-Solomon codes. The difference between standard,
extended and doubly extended Reed-Solomon codes is not practically significant, but generalized
Reed-Solomon codes are quite different, as the evaluation set can be exponentially larger than the
length of a codeword.

1.1 Related Work 

The pursuit of efficient decoding algorithms for Reed-Solomon codes has yielded intriguing results.
If the radius of a Hamming ball centered at some received word is less than half the minimum
distance, there can be at most one codeword in the Hamming ball. Finding this codeword is
called unambiguous decoding. It can be solved efficiently by a simple algorithm.

If the radius is less than $n - \sqrt{n(k-1)}$, the problem can be solved by the Guruswami-Sudan
algorithm [5], which outputs all the codewords inside a Hamming ball. If the radius is stretched
further, the number of codewords in a Hamming ball may be exponential. We then study the
bounded distance decoding problem, which asks for just one codeword in any Hamming ball of a
certain radius. More importantly, we can remove the restriction on the radius and investigate the
maximum likelihood decoding problem, which is the problem of computing the closest codeword
to any given vector in $\mathbb{F}_q^n$.
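To get a feeling for the two radii mentioned above, one can evaluate them for concrete parameters. The sketch below uses hypothetical helper names and the illustrative parameters $n = 255$, $k = 223$, which are not taken from the text.

```python
import math

def unique_decoding_radius(n, k):
    # largest integer radius strictly below half the minimum distance n - k + 1
    return (n - k) // 2

def guruswami_sudan_bound(n, k):
    # list decoding is possible for radii strictly below this value
    return n - math.sqrt(n * (k - 1))

n, k = 255, 223             # assumed example parameters
print(unique_decoding_radius(n, k), guruswami_sudan_bound(n, k))
```

For codes of high rate, as in this example, the two radii are close; the gap widens as $k/n$ decreases.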

The question of the decodability of Reed-Solomon codes has attracted attention recently, due to
discoveries on the relationship between decoding Reed-Solomon codes and some number
theoretical problems. Allowing exponential alphabets, Guruswami and Vardy proved that
maximum likelihood decoding is NP-complete. They essentially showed that deciding deep holes
is co-NP-complete. When the evaluation set is precisely the whole field or $\mathbb{F}_q^*$, an
NP-completeness result is hard to obtain. Cheng and Wan [3] managed to prove that the decoding
problem of Reed-Solomon codes at a certain radius is at least as hard as the discrete logarithm
problem over finite fields. In this paper, we wish to establish an additional connection between
decoding of standard Reed-Solomon codes and a classical number-theoretic problem: that of
determining the number of rational points on an algebraic hypersurface.



1.2 Our Results 

The decoding problem of Reed-Solomon codes can be reformulated as the problem of curve
fitting or noisy polynomial reconstruction. In this problem, we are given $n$ points

$$(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)$$

in $\mathbb{F}_q^2$. The goal is to find polynomials of degree at most $k - 1$ that pass through as
many of the $n$ points as possible. Note that all the $x$-coordinates are distinct.

Given the received word $w = (y_1, y_2, \cdots, y_n)$, we are particularly interested in the
polynomial obtained by interpolating the $n$ points:

$$w(x) = y_1 \frac{(x - x_2)(x - x_3) \cdots (x - x_n)}{(x_1 - x_2)(x_1 - x_3) \cdots (x_1 - x_n)}
+ \cdots
+ y_i \frac{(x - x_1) \cdots (x - x_{i-1})(x - x_{i+1}) \cdots (x - x_n)}{(x_i - x_1) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_n)}
+ \cdots
+ y_n \frac{(x - x_1)(x - x_2) \cdots (x - x_{n-1})}{(x_n - x_1)(x_n - x_2) \cdots (x_n - x_{n-1})}.$$
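The interpolation formula translates directly into code. The sketch below assumes a prime field $\mathbb{F}_p$ and Python 3.8 or later (for the modular inverse `pow(a, -1, p)`); the function names are hypothetical. Polynomials are stored as coefficient lists, lowest degree first.

```python
def poly_mul_linear(poly, r, p):
    """Multiply poly (coefficients, lowest degree first) by (x - r) modulo p."""
    out = [0] * (len(poly) + 1)
    for i, c in enumerate(poly):
        out[i + 1] = (out[i + 1] + c) % p
        out[i] = (out[i] - r * c) % p
    return out

def interpolate(xs, ys, p):
    """Coefficients of the unique polynomial of degree < n through (xs[i], ys[i])."""
    n = len(xs)
    result = [0] * n
    for i in range(n):
        basis, denom = [1], 1
        for j in range(n):
            if j != i:
                basis = poly_mul_linear(basis, xs[j], p)   # numerator of the i-th term
                denom = denom * (xs[i] - xs[j]) % p        # denominator of the i-th term
        scale = ys[i] * pow(denom, -1, p) % p
        for deg, c in enumerate(basis):
            result[deg] = (result[deg] + scale * c) % p
    return result

# the word (1, 4, 4, 1) over F_5 on the evaluation set {1, 2, 3, 4}
print(interpolate([1, 2, 3, 4], [1, 4, 4, 1], 5))
```

The degree of the returned polynomial is what matters in the sequel: it decides whether the received word is a codeword, and it is the quantity bounded in the main theorem.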



In this paper, we say that a polynomial $w(x)$ generates a vector $w \in \mathbb{F}_q^n$ if
$w = (w(x_1), w(x_2), \cdots, w(x_n))$. If the polynomial $w(x)$ has degree $k - 1$ or less, $w$
must be a codeword, and vice versa (since codewords consist of the encodings of all messages of
length $k$). If $w(x)$ has degree $k$, $w$ must be a deep hole (as we will later show). What if it
has degree larger than $k$? Can it be a deep hole?
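For toy parameters the question can be explored by brute force. The sketch below assumes the standard code over the prime field $\mathbb{F}_7$ with $k = 2$ (illustrative choices) and tests whether the vector generated by a given polynomial is at distance exactly $n - k$ from the code.

```python
from itertools import product

p, k = 7, 2                    # assumed toy parameters
xs = list(range(1, p))         # evaluation set F_7^*
n = len(xs)

def word(coeffs):
    # vector generated by the polynomial with coeffs[i] the coefficient of x^i
    return [sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p for x in xs]

code = [word(m) for m in product(range(p), repeat=k)]

def distance_to_code(w):
    return min(sum(a != b for a, b in zip(w, c)) for c in code)

def is_deep_hole(w):
    # the covering radius of the code is n - k
    return distance_to_code(w) == n - k

print(is_deep_hole(word([0, 0, 1])))      # generated by x^2, degree k
print(is_deep_hole(word([0, 0, 0, 1])))   # generated by x^3, degree k + 1
```

In this example $x^3$ agrees with the constant polynomial $1$ at the points $1, 2, 4$, so the vector it generates is within distance $n - k - 1$ of a codeword.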

In this paper, we try to answer this question. If a received word is a deep hole, there is no
codeword at distance $n - k - 1$ or closer to the received word. Hence if the distance
bound is $n - k - 1$, a decoding algorithm can tell whether a received word is a deep hole by
checking whether there is a codeword in the Hamming ball of radius $n - k - 1$. This shows that
maximum likelihood decoding of Reed-Solomon codes, as well as bounded distance decoding at radius
$n - k - 1$, is at least as hard as deciding deep holes. Observe that bounded distance decoding
at a distance of $n - k$ or more can be done efficiently. It is hoped that we can decrease the
radius until we reach the domain of hard problems.

We are mainly concerned with the case when the evaluation set consists of the nonzero elements
of the field. Notice that for generalized Reed-Solomon codes, bounded distance decoding at
distance $n - k - 1$ is NP-hard. We reduce the problem to deciding whether an absolutely
irreducible hypersurface contains a rational point whose coordinates are pairwise distinct and
nonzero. From the reduction, we show that if $k$ and the degree of $w(x)$ are small, $w(x)$ cannot
generate a deep hole. More precisely:

Theorem 1 Let $q$ be a prime power and $1 < k < q^{1/7-\epsilon}$ be a positive integer. The vector
$(w(\alpha))_{\alpha \in \mathbb{F}_q}$ is not a deep hole in the Reed-Solomon code $[q, k]_q$ if
the degree of $w(x)$ is greater than $k$ but less than $k + q^{3/13-\epsilon}$.

Roughly speaking, the theorem indicates that a vector generated by a low-degree polynomial
cannot be a deep hole, even though it is very far away from any codeword.

To prove the theorem, we need to estimate the number of rational points on an algebraic
hypersurface over a finite field. This problem is one of the central problems in algebraic geometry
and finite field theory. Weil, through his proof of the Riemann Hypothesis for function fields,
provided a bound for the number of points on algebraic curves. This bound was later generalized
by Weil and Lang to algebraic varieties. Schmidt [7] obtained some better bounds for absolutely
irreducible hypersurfaces by elementary means. In this paper, we will use his results and an
improved bound obtained very recently by Cafure and Matera. But first we give a new proof
that deciding whether or not a received word is a deep hole is co-NP-complete. Our reduction is
straightforward and much simpler than the one constructed by Guruswami and Vardy.

2 A simple proof that the maximum likelihood decoding is NP-complete

We reduce the following finite field subset sum problem to the deep hole problem of generalized
Reed-Solomon codes.

Instance: A set of $n$ elements $A = \{x_1, x_2, x_3, \cdots, x_n\} \subseteq \mathbb{F}_{2^m}$, an
element $b \in \mathbb{F}_{2^m}$ and a positive integer $k < n$.

Question: Is there a subset $\{x_{i_1}, x_{i_2}, \cdots, x_{i_{k+1}}\} \subseteq A$ of cardinality
$k + 1$ such that

$$x_{i_1} + x_{i_2} + \cdots + x_{i_{k+1}} = b?$$

Now consider the generalized Reed-Solomon code $[n, k]_{2^m}$ with evaluation set $A$. Suppose we
have a received word

$$w = (f(x_1), f(x_2), \cdots, f(x_n)),$$

where $f(x) = x^{k+1} + b x^k$. If the word is not a deep hole, it is at distance at most
$n - k - 1$ from a codeword. In other words, there is a polynomial $t(x)$ of degree $k - 1$ or less
such that $f(x) + t(x)$ has at least $k + 1$ distinct roots in $A$. Since $f(x) + t(x)$ is a monic
polynomial of degree $k + 1$, we have

$$f(x) + t(x) = x^{k+1} + b x^k + t(x) = (x - x_{i_1})(x - x_{i_2}) \cdots (x - x_{i_{k+1}})$$

for some $x_{i_1}, x_{i_2}, \cdots, x_{i_{k+1}}$ in $A$. Comparing the coefficients of $x^k$, and
recalling that signs are immaterial in characteristic 2, we obtain
$x_{i_1} + x_{i_2} + \cdots + x_{i_{k+1}} = b$.

On the other hand, if $x_{i_1} + x_{i_2} + \cdots + x_{i_{k+1}} = b$, then
$f(x) - (x - x_{i_1})(x - x_{i_2}) \cdots (x - x_{i_{k+1}})$ has degree $k - 1$ or less and
therefore generates a codeword. This codeword shares $k + 1$ values with $w$, so $w$ is at distance
at most $n - k - 1$ from a codeword and cannot be a deep hole.

In summary, $w$ is not a deep hole if and only if the answer to the instance of the finite field
subset sum problem is "Yes". Hence the deep hole problem is co-NP-complete.
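The equivalence can be checked exhaustively on a toy instance. The sketch below makes several assumptions for illustration only: the field is $\mathbb{F}_8 = \mathbb{F}_2[t]/(t^3 + t + 1)$ with elements stored as 3-bit integers, the set $A = \{1, 2, 3, 4, 5\}$ and $k = 2$ are arbitrary choices, and all function names are hypothetical.

```python
from itertools import combinations, product

def gf8_mul(a, b):
    """Multiply in F_8 = F_2[t]/(t^3 + t + 1); elements are 3-bit integers."""
    r = 0
    for _ in range(3):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:          # reduce modulo t^3 + t + 1
            a ^= 0b1011
    return r

def gf8_eval(coeffs, x):
    # Horner's rule; coeffs[i] is the coefficient of x^i, addition is XOR
    acc = 0
    for c in reversed(coeffs):
        acc = gf8_mul(acc, x) ^ c
    return acc

def is_deep_hole(A, k, b):
    f = [0] * k + [b, 1]                       # f(x) = x^(k+1) + b x^k
    w = [gf8_eval(f, x) for x in A]
    best = min(sum(gf8_eval(t, x) != y for x, y in zip(A, w))
               for t in product(range(8), repeat=k))
    return best == len(A) - k

def subset_sum(A, k, b):
    def xor_all(s):
        acc = 0
        for x in s:
            acc ^= x
        return acc
    return any(xor_all(s) == b for s in combinations(A, k + 1))

A, k = [1, 2, 3, 4, 5], 2
for b in range(8):
    print(b, is_deep_hole(A, k, b), subset_sum(A, k, b))
```

For every $b$ the two columns printed are complementary, as the reduction predicts. Both searches are exponential and serve only to illustrate the statement.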

By a similar argument, we know that polynomials of degree $k$ must generate deep holes.
Hence

Corollary 1 For a generalized Reed-Solomon code $[n, k]_q$, there are at least $(q - 1)q^k$ deep
holes.

We remark that the argument cannot work for small evaluation sets, because the subset sum 
is easy in that case. Indeed, if the evaluation set is the whole field and k > 1, then a polynomial 
of degree k + 1 cannot generate a deep hole. 

3 A hypersurface related to deep holes 

The above argument motivates us to consider vectors generated by polynomials of larger degree.
We are given some received word $w$ and we want to know whether or not it is a deep hole.
The received word is generated by $w(x)$ of the form $f(x) + t(x)$, where $f(x)$ is some polynomial
containing no terms of degree smaller than $k$, and $t(x)$ is some polynomial containing only terms
of degree $k - 1$ or smaller. For purposes of determining whether or not $w$ is a deep hole, we fix
a monic $f(x)$

$$f(x) = x^{k+d} + f_{d-1} x^{k+d-1} + \cdots + f_0 x^k \in \mathbb{F}_q[x]$$

and let $t(x)$ vary and ask whether $f(x) + t(x)$ has $k + 1$ roots, or perhaps more.

In essence, the question is one of finding a polynomial whose leading terms are $f(x)$ and
which has as many zeroes as possible over a certain field. (As stated earlier, for $k > 1$, if
$f(x)$ has degree $k$, then $w$ is a deep hole. If $f(x)$ has degree $k + 1$, then it is not a deep
hole.)

The most obvious way to approach this problem is to symbolically divide $f(x)$ by a polynomial
that is the product of $k + 1$ distinct (but unknown) linear factors, and determine whether or not
it is possible to set the leading term of the remainder, i.e., the coefficient of $x^k$, equal to
zero. If the leading coefficient is 0, the remainder has degree $k - 1$ or less in $x$ and
therefore generates a codeword. The distance between the codeword and $w$ will be at most
$n - k - 1$.

A polynomial that is the product of $k + 1$ distinct linear factors will have the elementary
symmetric polynomials as coefficients,

$$\Pi = (x - x_1)(x - x_2) \cdots (x - x_{k+1}) = x^{k+1} + \pi_1 x^k + \pi_2 x^{k-1} + \cdots + \pi_{k+1},$$

where $\pi_i$ is, up to sign, the $i$-th elementary symmetric polynomial in
$x_1, x_2, \cdots, x_{k+1}$.

Since $\Pi$ is monic, the remainder of $f(x)$ divided by $\Pi$ will be a polynomial in
$\mathbb{F}_q[x_1, x_2, \cdots, x_{k+1}][x]$ of degree less than $k + 1$. Denote the leading
coefficient of the remainder by $L_{f_0, f_1, \cdots, f_{d-1}}(x_1, x_2, \cdots, x_{k+1})$.
This is a multivariate polynomial of degree $d$.

As an example, imagine dividing the polynomial $x^{k+1}$ by $\Pi$. We can easily verify that the
leading term of the remainder is $-\pi_1 x^k$. Since we can always find $k + 1$ distinct values
that satisfy $\pi_1 = 0$, we know that $x^{k+1}$ cannot generate a deep hole. But in most cases
$w(x)$ will have larger degree and contain many terms, and the leading coefficient of the remainder
will be a more complex multivariate polynomial in $k + 1$ variables, rather than a linear
polynomial in $k + 1$ variables. If the leading coefficient has a zero whose coordinates are
pairwise distinct and invertible, then $f(x) + t(x)$ cannot generate a deep hole.
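The division can also be carried out numerically for a concrete choice of the roots. The following sketch assumes a prime field $\mathbb{F}_p$ and uses hypothetical function names; it returns the coefficient of $x^k$ in the remainder of $f$ modulo $\Pi$ for given values of $x_1, \cdots, x_{k+1}$.

```python
def product_of_linear(roots, p):
    """Coefficients (lowest degree first) of the product of (x - a) over a in roots."""
    g = [1]
    for a in roots:
        out = [0] * (len(g) + 1)
        for i, c in enumerate(g):
            out[i + 1] = (out[i + 1] + c) % p
            out[i] = (out[i] - a * c) % p
        g = out
    return g

def poly_rem_monic(f, g, p):
    """Remainder of f modulo a monic g over F_p; coefficients lowest degree first."""
    dg = len(g) - 1
    r = [c % p for c in f]
    r += [0] * max(0, dg - len(r))
    for i in range(len(r) - 1, dg - 1, -1):
        c = r[i]
        if c:
            for j in range(dg + 1):      # subtract c * x^(i - dg) * g
                r[i - dg + j] = (r[i - dg + j] - c * g[j]) % p
    return r[:dg]

def leading_coefficient(f, roots, p):
    # coefficient of x^k in f mod (x - a_1)...(x - a_{k+1}), where k + 1 = len(roots)
    return poly_rem_monic(f, product_of_linear(roots, p), p)[-1]

# x^3 over F_7 with k = 2: the leading coefficient is the sum of the roots
print(leading_coefficient([0, 0, 0, 1], [1, 2, 3], 7))
print(leading_coefficient([0, 0, 0, 1], [1, 2, 4], 7))
```

In the second call the roots $1, 2, 4$ sum to zero modulo 7, so the leading coefficient vanishes and $x^3$ agrees with a codeword at these three points.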

We now argue that the leading coefficient of the remainder is absolutely irreducible. We write

$$L_{f_0, f_1, \cdots, f_{d-1}}(x_1, x_2, \cdots, x_{k+1}) = F_d + F_{d-1} + \cdots + F_0,$$

where $F_i$ is a form containing all the terms of degree $i$ in $L$. The polynomial
$L_{f_0, f_1, \cdots, f_{d-1}}(x_1, x_2, \cdots, x_{k+1})$ is absolutely irreducible if $F_d$ is
absolutely irreducible. To see this, suppose that $L$ can be factored as $L' L''$. Let $F'_{d_1}$
be the form of highest degree in $L'$ and $F''_{d_2}$ be the form of highest degree in $L''$.
Then we have $F_d = F'_{d_1} F''_{d_2}$, a contradiction to the condition that $F_d$ is absolutely
irreducible. Fortunately $F_d$ does not depend on the $f_i$'s.

Lemma 1 The form of the highest degree in $L_{f_0, f_1, \cdots, f_{d-1}}(x_1, x_2, \cdots, x_{k+1})$
is exactly $L_{0, 0, \cdots, 0}(x_1, x_2, \cdots, x_{k+1})$.

Proof: It can be proved by mathematical induction on $d$. □

In the next section, we argue that the term of highest degree in the leading coefficient, which we
will call $\chi_d(x_1, x_2, \cdots, x_{k+1})$, is absolutely irreducible. We will actually show
that $\chi_d(x_1, x_2, 1, 0, \cdots, 0)$ is absolutely irreducible. This suffices because
$\chi_d(x_1, x_2, 1, 0, \cdots, 0)$ has the same degree as $\chi_d(x_1, x_2, \cdots, x_{k+1})$:
if the former is irreducible, so is the latter.

Lemma 2 $\chi_d(x_1, x_2, 1, 0, \cdots, 0) = \sum_{i+j \le d} x_1^i x_2^j$.

Proof: We need to compute the leading coefficient of the remainder after dividing $x^{k+d}$ by
$(x - x_1)(x - x_2)(x - 1)x^{k-2}$. It is the same as the leading coefficient of the remainder
after dividing $x^{d+2}$ by $(x - x_1)(x - x_2)(x - 1)$. The remainder is a quadratic polynomial in
$x$. When we evaluate it at $x_1$, it takes value $x_1^{d+2}$. When we evaluate it at $x_2$, it
takes value $x_2^{d+2}$. When we evaluate it at 1, it takes value 1. By interpolating, we obtain
the unique polynomial satisfying these conditions. It is

$$x_1^{d+2} \frac{(x - x_2)(x - 1)}{(x_1 - x_2)(x_1 - 1)}
+ x_2^{d+2} \frac{(x - x_1)(x - 1)}{(x_2 - x_1)(x_2 - 1)}
+ \frac{(x - x_1)(x - x_2)}{(1 - x_1)(1 - x_2)}.$$

The leading coefficient is

$$\frac{x_1^{d+2}}{(x_1 - x_2)(x_1 - 1)} + \frac{x_2^{d+2}}{(x_2 - x_1)(x_2 - 1)} + \frac{1}{(1 - x_1)(1 - x_2)},$$

which is equal to $\sum_{i+j \le d} x_1^i x_2^j$.

□
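The identity of Lemma 2 is easy to test numerically. The sketch below assumes a prime field $\mathbb{F}_p$ and hypothetical function names; it reduces $x^{d+2}$ modulo $(x - a)(x - b)(x - 1)$ by repeated multiplication by $x$ and compares the coefficient of $x^2$ with $\sum_{i+j \le d} a^i b^j$.

```python
def x_power_mod_cubic(e, a, b, p):
    """Coefficients (c0, c1, c2) of x^e mod (x - a)(x - b)(x - 1) over F_p."""
    # (x - a)(x - b)(x - 1) = x^3 - s1 x^2 + s2 x - s3
    s1 = (a + b + 1) % p
    s2 = (a * b + a + b) % p
    s3 = (a * b) % p
    c0, c1, c2 = 1, 0, 0                      # start from the polynomial 1
    for _ in range(e):
        # multiply by x, then replace x^3 by s1 x^2 - s2 x + s3
        c0, c1, c2 = (c2 * s3) % p, (c0 - c2 * s2) % p, (c1 + c2 * s1) % p
    return c0, c1, c2

def complete_sum(d, a, b, p):
    # sum of a^i b^j over all i, j >= 0 with i + j <= d
    return sum(pow(a, i, p) * pow(b, j, p)
               for i in range(d + 1) for j in range(d + 1 - i)) % p

p, d, a, b = 11, 3, 2, 5                      # assumed example values
print(x_power_mod_cubic(d + 2, a, b, p)[2], complete_sum(d, a, b, p))
```

Because the identity holds for polynomials, the two numbers agree for every choice of $a$ and $b$, including the degenerate cases where the three roots are not distinct.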



4 A smooth curve 

This section is devoted to the proof of the irreducibility of the bivariate polynomial
$\sum_{i+j \le d} x^i y^j$.

Lemma 3 The curve $f(x, y) = \sum_{i+j \le d} x^i y^j$ is absolutely irreducible.

Proof: To show that $f(x, y) = \sum_{i+j \le d} x^i y^j$ is absolutely irreducible, we actually
prove the stronger statement that $f(x, y) = 0$ is a smooth algebraic curve. It can be verified by
simple calculation that the places of the curve at infinity are nonsingular. Hence it is sufficient
to show that all the affine places of the curve are nonsingular, i.e., that the system of equations

$$\begin{cases} f(x, y) = 0 \\ \frac{\partial f}{\partial x} = 0 \\ \frac{\partial f}{\partial y} = 0 \end{cases}$$

has no solution.

First, it is convenient to write $f(x, y)$ as

$$x^d + (y + 1)x^{d-1} + (y^2 + y + 1)x^{d-2} + \cdots + (y^d + \cdots + 1).$$

We write $\frac{\partial f}{\partial x}$ as

$$d x^{d-1} + (d - 1)(y + 1)x^{d-2} + \cdots + (y^{d-1} + \cdots + 1)$$

and $\frac{\partial f}{\partial y}$ as

$$d y^{d-1} + (d - 1)(x + 1)y^{d-2} + \cdots + (x^{d-1} + \cdots + 1).$$

Assume that there is a solution $(x, y)$ to the system of equations. Compute
$(x - y)\frac{\partial f}{\partial x}$:

$$x \frac{\partial f}{\partial x} = d x^d + (d - 1)(y + 1)x^{d-1} + \cdots + x(y^{d-1} + \cdots + 1)$$
$$= (d + 1)x^d + d(y + 1)x^{d-1} + \cdots + (y^d + y^{d-1} + \cdots + 1) - f(x, y),$$

$$y \frac{\partial f}{\partial x} = d y x^{d-1} + (d - 1)(y^2 + y)x^{d-2} + \cdots + (y^d + \cdots + y).$$

Their difference is

$$(x - y)\frac{\partial f}{\partial x} = d x^d + [d - (y + 1)]x^{d-1} + \cdots + [1 - (y^d + \cdots + 1)]$$
$$= (d + 1)x^d + d x^{d-1} + (d - 1)x^{d-2} + \cdots + 1 - f(x, y).$$

Since $f(x, y)$ and $\frac{\partial f}{\partial x}$ must be zero, we know that

$$(d + 1)x^d + d x^{d-1} + (d - 1)x^{d-2} + \cdots + 1 = 0.$$

Multiplying this equation by $x$ and then subtracting the original equation, we get

$$(d + 1)x^{d+1} = x^d + x^{d-1} + \cdots + 1. \qquad (1)$$

Repeating the process on $\frac{\partial f}{\partial y}$, we get

$$(d + 1)y^{d+1} = y^d + y^{d-1} + \cdots + 1. \qquad (2)$$

This shows that neither $x$ nor $y$ can be zero. Now, observe that

$$(x - y) f(x, y) = x^{d+1} + x^d + \cdots + 1 - y^{d+1} - y^d - \cdots - 1 = 0.$$

This means that $(d + 2)x^{d+1} = (d + 2)y^{d+1}$. We also know that the right-hand sides of (1)
and (2) are actually $\frac{x^{d+1} - 1}{x - 1}$ and $\frac{y^{d+1} - 1}{y - 1}$. So multiplying
both sides by $x - 1$ for (1) and by $y - 1$ for (2), we obtain

$$(d + 1)x^{d+2} - (d + 2)x^{d+1} = -1,$$
$$(d + 1)y^{d+2} - (d + 2)y^{d+1} = -1.$$

Hence we have $(d + 1)x^{d+2} = (d + 1)y^{d+2}$.

If $d + 1 = 0$, we have $x^d + x^{d-1} + \cdots + 1 = 0$ and
$d x^{d-1} + (d - 1)x^{d-2} + \cdots + 1 = 0$, which is the same as

$$\frac{d}{dx}\left(x^d + x^{d-1} + \cdots + 1\right) = 0.$$

This means that the equation $x^d + x^{d-1} + \cdots + 1 = 0$ has a multiple root. Contradiction.

If $d + 2 = 0$, we conclude that $x^{d+1} + x^d + \cdots + 1 = 0$ has a multiple root. Again a
contradiction.

Now we can assume that the characteristic of the field does not divide either $d + 1$ or $d + 2$.
In particular, this means that the field must have odd characteristic. We have
$x^{d+2} = y^{d+2}$ and $x^{d+1} = y^{d+1}$. Therefore $x = y$.

In this case, $\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are the same. We
know that $f(x, y) = 0$ can be written (since $x = y$) as

$$(d + 1)x^d + d x^{d-1} + (d - 1)x^{d-2} + \cdots + 1 = 0.$$

If $x = y$, then

$$\frac{\partial f}{\partial x} = (d + (d - 1) + \cdots + 1)x^{d-1} + ((d - 1) + (d - 2) + \cdots + 1)x^{d-2} + \cdots + 1 = 0.$$

If we subtract $\frac{\partial f}{\partial x}$ from $f(x, y)$, we get

$$(d + 1)x^d - ((d - 1) + (d - 2) + \cdots + 1)x^{d-1} - \cdots - x = 0.$$

Dividing the result by $x$, we have

$$(d + 1)x^{d-1} - ((d - 1) + (d - 2) + \cdots + 1)x^{d-2} - \cdots - 1 = 0.$$

Adding $\frac{\partial f}{\partial x} = 0$ to this equation, all the terms of degree lower than
$d - 1$ cancel and we obtain

$$(d + 1)x^{d-1} + (d + (d - 1) + \cdots + 1)x^{d-1} = 0.$$

This means that $\frac{(d+1)(d+2)}{2} x^{d-1} = 0$, hence $x = 0$, and this is a contradiction.

□ 
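As a sanity check for small parameters, one can search for singular affine points directly. The sketch below assumes a prime field $\mathbb{F}_p$ and hypothetical function names, and it only inspects $\mathbb{F}_p$-rational points, which is weaker than the statement over the algebraic closure and is no substitute for the proof.

```python
def f_and_partials(x, y, d, p):
    """Values of f, df/dx and df/dy at (x, y) modulo p, for f = sum of x^i y^j, i + j <= d."""
    f = fx = fy = 0
    for i in range(d + 1):
        for j in range(d + 1 - i):
            f += pow(x, i, p) * pow(y, j, p)
            if i:
                fx += i * pow(x, i - 1, p) * pow(y, j, p)
            if j:
                fy += j * pow(x, i, p) * pow(y, j - 1, p)
    return f % p, fx % p, fy % p

def singular_affine_points(d, p):
    # F_p-rational points where f and both partial derivatives vanish
    return [(x, y) for x in range(p) for y in range(p)
            if f_and_partials(x, y, d, p) == (0, 0, 0)]

print(singular_affine_points(2, 7))   # assumed example: d = 2 over F_7
```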



5 Estimation of rational points 

Cafure and Matera (Theorem 5.2 of their paper) have obtained the following estimate of the number
of rational points on an absolutely irreducible hypersurface:

Proposition 1 An absolutely irreducible $\mathbb{F}_q$-hypersurface of degree $d$ in
$\mathbb{F}_q^n$ contains at least

$$q^{n-1} - (d - 1)(d - 2)q^{n-3/2} - 5d^{13/3}q^{n-2}$$

many $\mathbb{F}_q$-rational points.

We also use the following proposition, proved by Schmidt [7].

Proposition 2 Suppose $f_1(x_1, x_2, \cdots, x_n)$ and $f_2(x_1, x_2, \cdots, x_n)$ are polynomials
of degree not greater than $d$, and they do not have a common factor. Then the number of
$\mathbb{F}_q$-rational solutions of

$$f_1 = f_2 = 0$$

is at most

$$2nd^3q^{n-2}.$$

We seek solutions of $L(x_1, x_2, \cdots, x_{k+1}) = 0$ that are not solutions of

$$\prod_{1 \le i \le k+1} x_i \prod_{1 \le i < j \le k+1} (x_i - x_j) = 0.$$

We count the number of rational solutions of $L(x_1, x_2, \cdots, x_{k+1}) = 0$, minus the number
of rational solutions of

$$\begin{cases} L(x_1, x_2, \cdots, x_{k+1}) = 0 \\ \prod_{1 \le i \le k+1} x_i \prod_{1 \le i < j \le k+1} (x_i - x_j) = 0 \end{cases}$$

The number is greater than

$$q^k - (d - 1)(d - 2)q^{k-1/2} - 5d^{13/3}q^{k-1} - 2(k + 1)\left[\max\left(d, \frac{k^2 + 3k + 2}{2}\right)\right]^3 q^{k-1},$$

which is greater than 0 if $d < q^{3/13-\epsilon}$ and $k < q^{1/7-\epsilon}$. This concludes the
proof of the main theorem.
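To see for which parameters the lower bound is actually positive, it can be evaluated numerically. The sketch below divides the bound by $q^{k-1}$ to keep the numbers small; the function name and the sample parameters are hypothetical, and floating-point arithmetic is used, so values very close to zero should not be trusted.

```python
def normalized_lower_bound(q, k, d):
    """The lower bound on the number of good solutions, divided by q^(k-1)."""
    D = max(d, (k + 1) * (k + 2) // 2)      # (k^2 + 3k + 2) / 2 = (k + 1)(k + 2) / 2
    return (q
            - (d - 1) * (d - 2) * q ** 0.5
            - 5 * d ** (13 / 3)
            - 2 * (k + 1) * D ** 3)

# assumed example: k = 2, d = 2
print(normalized_lower_bound(2048, 2, 2))
print(normalized_lower_bound(1024, 2, 2))
```

For $k = 2$ and $d = 2$ the bound becomes positive only once $q$ exceeds roughly 1400, which illustrates how large the field must be relative to $k$ and $d$ for the argument to apply.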

6 Concluding Remarks 

It has been proved that for generalized Reed-Solomon codes, bounded distance decoding at
radius $n - k - 1$ is NP-hard. In this paper, we try to determine the complexity of this problem
for standard Reed-Solomon codes. We reduce the problem to that of determining whether a
hypersurface contains a rational point with distinct coordinates. While we did not solve the
problem completely, we show that for small $k$, this problem is easy if the vector is generated by
a polynomial of small degree. In essence, we ask whether there exists a polynomial with many
distinct rational roots under the restriction that some coefficients are prefixed. This problem
bears an interesting comparison with the active research [1] on the construction of irreducible
polynomials with some prefixed coefficients.

To solve the problem for every $k$ and every vector, there are two apparent approaches: 1) find
a better estimate of the number of rational points on the hypersurfaces; 2) exploit the special
structure of the hypersurfaces. From an averaging argument, it is tempting to conjecture that the
vectors generated by polynomials of degree $k$ are the only possible deep holes. If so, we can
completely classify deep holes. We leave it as an open problem.